35 research outputs found

    Population genetic inference of demographic processes in the African Wild Silk Moth, Gonometa postica (Lasiocampidae)

    The African Wild Silk moths (Gonometa spp., Lasiocampidae) are presently of particular economic interest in southern Africa. Both Gonometa postica and G. rufobrunnea, two species native to southern Africa, have been shown to possess a silk fibre of exceptional quality. A small-scale cottage industry utilizing the silk of Gonometa species currently exists in southern Africa, yet a consistent complaint is the lack of supply of cocoons. The Gonometa species in southern Africa have been shown to exhibit large inter-annual population fluctuations. However, it is uncertain whether eruptions result only from local populations experiencing ideal conditions or whether current eruptions are initiated by dispersal of individuals from eruptive populations in previous generations. A second observation regarding eruptions is that they are patchily distributed at both the local (within outbreaks) and regional (across southern Africa) scale. In this thesis I have studied population eruptions through distribution analysis of three years of presence/absence data, and through spatial and temporal population genetic analysis. The analysis of population genetic data allows the inference of population demographic parameters such as population size fluctuations and migrations. In particular, the use of microsatellite markers allows a high-resolution analysis of the connectivity of populations, and provides a signal of population size fluctuations. I utilise both mitochondrial DNA control region sequences and polymorphic microsatellite loci to make inferences about population processes in G. postica, using a combination of analytical and simulation-based approaches. The results, in general, indicate that dispersal of moths across South Africa is extensive.
These results are further considered in light of the effects of population size fluctuations on spatial genetic pattern, where the potential exists for unstable population demography to influence the inference of dispersal from population genetic data. The population genetic analyses presented here allow the inference of the extent of a local population/outbreak, and the degree of movement between local populations. Given that a large-scale population dynamics project based on G. postica is currently under development, the results determine the geographical extent at which the population dynamics study should be conducted. Furthermore, the population genetics data generated will contribute to the construction of a population dynamics model, including abiotic and biotic variables, which will allow a better understanding of eruptions in this species. Thesis (PhD (Genetics))--University of Pretoria, 2006.

    Evolutionary Constraints Acting on DDX3X Protein Potentially Interferes with Rev-Mediated Nuclear Export of HIV-1 RNA

    Differential host-pathogen interactions direct viral replication in infected cells. In HIV-1 infected cells, nuclear export of viral RNA transcripts into the cytoplasm is governed by the interaction of HIV-1 Rev, Exportin-1 (CRM-1) and DDX3X. Knockdown of DDX3X has been shown to drastically impair HIV replication. Here we show that evolutionary forces demarcate previously unidentified, functionally important residues on the surface of DDX3X. Using computational approaches, we show that these functional residues, depending on their location, are capable of regulating the ATPase and RNA helicase functions of DDX3X. The potential of these residues for designing better blockers of HIV-1 replication was also assessed. In addition, using stepwise docking simulations, we identified the DDX3X-CRM-1 interface and its critical functional residues. Our data help explain the role of DDX3X in HIV-1 Rev function, with potential for designing new intervention strategies against HIV-1 replication.

    Frequent toggling between alternative amino acids is driven by selection in HIV-1

    Author Summary: Viruses, such as HIV, are able to evade host immune responses through escape mutations, yet sometimes they do so at a cost. This cost is the reduction in the ability of the virus to replicate, and thus selective pressure exists for a virus to revert to its original state in the absence of the host immune response that caused the initial escape mutation. This pattern of escape and reversion typically occurs when viruses are transmitted between individuals with different immune responses. We develop a phylogenetic model of immune escape and reversion and provide evidence that it outperforms existing models for the detection of selective pressure associated with host immune responses. Finally, we demonstrate that amino acid toggling is a pervasive process in HIV-1 evolution, such that many of the positions in the virus that evolve rapidly, under the influence of positive Darwinian selection, nonetheless display quite low sequence diversity. This highlights the limitations of HIV-1 evolution, and sites such as these are potentially good targets for HIV-1 vaccines.

    Benchmarking multi-rate codon models

    CITATION: Delport, W. et al. 2010. Benchmarking multi-rate codon models. PLoS ONE, 5(7): e11587, doi:10.1371/journal.pone.0011587. The original publication is available at http://journals.plos.org/plosone. The single rate codon model of non-synonymous substitution is ubiquitous in phylogenetic modeling. Indeed, the use of a non-synonymous to synonymous substitution rate ratio parameter has facilitated the interpretation of selection pressure on genomes. Although the single rate model has achieved wide acceptance, we argue that the assumption of a single rate of non-synonymous substitution is biologically unreasonable, given observed differences in substitution rates evident from empirical amino acid models. Some have attempted to incorporate amino acid substitution biases into models of codon evolution and have shown improved model performance versus the single rate model. Here, we show that the single rate model of non-synonymous substitution is easily outperformed by a model with multiple non-synonymous rate classes, yet in which amino acid substitution pairs are assigned randomly to these classes. We argue that, since the single rate model is so easy to improve upon, new codon models should not be validated entirely on the basis of improved model fit over this model. Rather, we should strive to both improve on the single rate model and to approximate the general time-reversible model of codon substitution, with as few parameters as possible, so as to reduce model over-fitting. We hint at how this can be achieved with a Genetic Algorithm approach in which rate classes are assigned on the basis of sequence information content. © 2010 Delport et al. http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0011587
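The random baseline described in this abstract (amino acid substitution pairs assigned at random to non-synonymous rate classes) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `random_rate_classes`, the choice of three classes, and the use of all 190 unordered amino acid pairs are assumptions made here. A codon model would typically only need the pairs connected by a single-nucleotide change, which this sketch ignores.

```python
import random
from itertools import combinations

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def random_rate_classes(n_classes, seed=0):
    """Assign every unordered amino acid pair to a random rate class.

    Hypothetical helper for illustration: returns a dict mapping
    (aa1, aa2) -> class index in range(n_classes).
    """
    rng = random.Random(seed)  # seeded so the assignment is reproducible
    pairs = list(combinations(AMINO_ACIDS, 2))  # 20 choose 2 = 190 pairs
    return {pair: rng.randrange(n_classes) for pair in pairs}

assignment = random_rate_classes(n_classes=3)
```

Each class would then get its own non-synonymous rate parameter to be estimated by maximum likelihood; the fitting step is outside the scope of this sketch.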

    Correcting the Bias of Empirical Frequency Parameter Estimators in Codon Models

    Markov models of codon substitution are powerful inferential tools for studying biological processes such as natural selection and preferences in amino acid substitution. The equilibrium character distributions of these models are almost always estimated using nucleotide frequencies observed in a sequence alignment, primarily as a matter of historical convention. In this note, we demonstrate that a popular class of such estimators is biased, and that this bias has an adverse effect on goodness of fit and estimates of substitution rates. We propose a “corrected” empirical estimator that begins with observed nucleotide counts, but accounts for the nucleotide composition of stop codons. We show via simulation that the corrected estimates outperform the de facto standard estimates not just by providing better estimates of the frequencies themselves, but also by leading to improved estimation of other parameters in the evolutionary models. On a curated collection of sequence alignments, our estimators show a significant improvement in goodness of fit compared to the approach. Maximum likelihood estimation of the frequency parameters appears to be warranted in many cases, albeit at a greater computational cost. Our results demonstrate that there is little justification, either statistical or computational, for continued use of the -style estimators.
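One way to see the kind of bias this abstract describes is a small numerical sketch. It assumes the estimator in question multiplies position-specific nucleotide frequencies and renormalizes over sense codons. The estimator's name did not survive in the abstract text, so this form is an assumption, as are the universal genetic code and the helper names. The sketch only demonstrates the mismatch; it does not implement the authors' corrected estimator.

```python
from itertools import product

NUCS = "ACGT"
STOPS = {"TAA", "TAG", "TGA"}  # universal genetic code (assumption)

def product_codon_freqs(pos_freqs):
    """Codon frequencies as products of per-position nucleotide
    frequencies, renormalized after dropping stop codons."""
    raw = {}
    for a, b, c in product(NUCS, repeat=3):
        codon = a + b + c
        if codon in STOPS:
            continue  # stop codons are excluded from the state space
        raw[codon] = pos_freqs[0][a] * pos_freqs[1][b] * pos_freqs[2][c]
    total = sum(raw.values())
    return {codon: f / total for codon, f in raw.items()}

def implied_position_freqs(codon_freqs):
    """Per-position nucleotide frequencies implied by a codon distribution."""
    out = [dict.fromkeys(NUCS, 0.0) for _ in range(3)]
    for codon, f in codon_freqs.items():
        for i, nuc in enumerate(codon):
            out[i][nuc] += f
    return out

# Feed in uniform nucleotide frequencies at all three positions.
uniform = [dict.fromkeys(NUCS, 0.25) for _ in range(3)]
codon_freqs = product_codon_freqs(uniform)
implied = implied_position_freqs(codon_freqs)
```

With uniform inputs, all 61 sense codons get frequency 1/61, so the implied frequency of T at the first position is 13/61 (about 0.213) rather than the 0.25 that was supplied. Removing the stop codons shifts the nucleotide composition the model actually implies.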

    Experimental evidence indicating that mastreviruses probably did not co-diverge with their hosts

    Background. Despite the demonstration that geminiviruses, like many other single stranded DNA viruses, are evolving at rates similar to those of RNA viruses, a recent study has suggested that grass-infecting species in the genus Mastrevirus may have co-diverged with their hosts over millions of years. This "co-divergence hypothesis" requires that long-term mastrevirus substitution rates be at least 100,000-fold lower than their basal mutation rates and 10,000-fold lower than their observable short-term substitution rates. The credibility of this hypothesis, therefore, hinges on the testable claim that negative selection during mastrevirus evolution is so potent that it effectively purges 99.999% of all mutations that occur. Results. We have conducted long-term evolution experiments lasting between 6 and 32 years, where we have determined substitution rates of between 2 and 3 × 10⁻⁴ substitutions/site/year for the mastreviruses Maize streak virus (MSV) and Sugarcane streak Réunion virus (SSRV). We further show that mutation biases are similar for different geminivirus genera, suggesting that mutational processes that drive high basal mutation rates are conserved across the family. Rather than displaying signs of extremely severe negative selection as implied by the co-divergence hypothesis, our evolution experiments indicate that MSV and SSRV are predominantly evolving under neutral genetic drift. Conclusion. The absence of strong negative selection signals within our evolution experiments and the uniformly high geminivirus substitution rates that we and others have reported suggest that mastreviruses cannot have co-diverged with their hosts. © 2009 Harkins et al; licensee BioMed Central Ltd.
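The link between the 100,000-fold figure and the 99.999% figure in this abstract can be written out as a short calculation. The symbols are introduced here for illustration and are not from the source: $\mu$ is the basal mutation rate and $r$ the long-term substitution rate. The sketch assumes that $r/\mu$ is the fraction of mutations that survive negative selection.

```latex
\[
\frac{r}{\mu} \le \frac{1}{100{,}000} = 10^{-5}
\quad\Longrightarrow\quad
1 - \frac{r}{\mu} \ge 1 - 10^{-5} = 0.99999 = 99.999\%.
\]
```

Under that assumption, a 100,000-fold gap between mutation and substitution rates means at least 99.999% of mutations are purged.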

    Evolutionary distances in the twilight zone -- a rational kernel approach

    Phylogenetic tree reconstruction is traditionally based on multiple sequence alignments (MSAs) and heavily depends on the validity of this information bottleneck. With increasing sequence divergence, the quality of MSAs decays quickly. Alignment-free methods, on the other hand, are based on abstract string comparisons and avoid potential alignment problems. However, in general they are not biologically motivated and ignore our knowledge about the evolution of sequences. Thus, it is still a major open question how to define an evolutionary distance metric between divergent sequences that makes use of indel information and known substitution models without the need for a multiple alignment. Here we propose a new evolutionary distance metric to close this gap. It uses finite-state transducers to create a biologically motivated similarity score which models substitutions and indels, and does not depend on a multiple sequence alignment. The sequence similarity score is defined in analogy to pairwise alignments and additionally has the positive semi-definite property. We describe its derivation and show in simulation studies and real-world examples that it is more accurate in reconstructing phylogenies than competing methods. The result is a new and accurate way of determining evolutionary distances in and beyond the twilight zone of sequence alignments that is suitable for large datasets. Comment: to appear in PLoS ONE.

    CodonTest: Modeling Amino Acid Substitution Preferences in Coding Sequences

    Codon models of evolution have facilitated the interpretation of selective forces operating on genomes. These models, however, assume a single rate of non-synonymous substitution irrespective of the nature of amino acids being exchanged. Recent developments have shown that models which allow for amino acid pairs to have independent rates of substitution offer improved fit over single rate models. However, these approaches have been limited by the necessity for large alignments in their estimation. An alternative approach is to assume that substitution rates between amino acid pairs can be subdivided into rate classes, dependent on the information content of the alignment. However, given the combinatorially large number of such models, an efficient model search strategy is needed. Here we develop a Genetic Algorithm (GA) method for the estimation of such models. A GA is used to assign amino acid substitution pairs to a series of rate classes, where the number of classes is estimated from the alignment. Other parameters of the phylogenetic Markov model, including substitution rates, character frequencies and branch lengths, are estimated using standard maximum likelihood optimization procedures. We apply the GA to empirical alignments and show improved model fit over existing models of codon evolution. Our results suggest that current models are poor approximations of protein evolution and thus gene- and organism-specific multi-rate models that incorporate amino acid substitution biases are preferred. We further anticipate that the clustering of amino acid substitution rates into classes will be biologically informative, such that genes with similar functions exhibit similar clustering, and hence this clustering will be useful for the evolutionary fingerprinting of genes.

    Long-Branch Attraction Bias and Inconsistency in Bayesian Phylogenetics

    Bayesian inference (BI) of phylogenetic relationships uses the same probabilistic models of evolution as its precursor maximum likelihood (ML), so BI has generally been assumed to share ML's desirable statistical properties, such as largely unbiased inference of topology given an accurate model and increasingly reliable inferences as the amount of data increases. Here we show that BI, unlike ML, is biased in favor of topologies that group long branches together, even when the true model and prior distributions of evolutionary parameters over a group of phylogenies are known. Using experimental simulation studies and numerical and mathematical analyses, we show that this bias becomes more severe as more data are analyzed, causing BI to infer an incorrect tree as the maximum a posteriori phylogeny with asymptotically high support as sequence length approaches infinity. BI's long branch attraction bias is relatively weak when the true model is simple but becomes pronounced when sequence sites evolve heterogeneously, even when this complexity is incorporated in the model. This bias, which is apparent under both controlled simulation conditions and in analyses of empirical sequence data, also makes BI less efficient and less robust to the use of an incorrect evolutionary model than ML. Surprisingly, BI's bias is caused by one of the method's stated advantages: that it incorporates uncertainty about branch lengths by integrating over a distribution of possible values instead of estimating them from the data, as ML does. Our findings suggest that trees inferred using BI should be interpreted with caution and that ML may be a more reliable framework for modern phylogenetic analysis.

    9-Genes Reinforce the Phylogeny of Holometabola and Yield Alternate Views on the Phylogenetic Placement of Strepsiptera

    Background: The extraordinary morphology, reproductive and developmental biology, and behavioral ecology of twisted wing parasites (order Strepsiptera) have puzzled biologists for centuries. Even today, the phylogenetic position of these enigmatic “insects from outer space” [1] remains uncertain and contentious. Recent authors have argued for the placement of Strepsiptera within or as a close relative of beetles (order Coleoptera), as sister group of flies (order Diptera), or even outside of Holometabola. Methodology/Principal Findings: Here, we combine data from several recent studies with new data (for a total of 9 nuclear genes and ∼13 kb of aligned data for 34 taxa), to help clarify the phylogenetic placement of Strepsiptera. Our results unequivocally support the monophyly of Neuropteroidea (= Neuropterida + Coleoptera) + Strepsiptera, but recover Strepsiptera either derived from within polyphagan beetles (order Coleoptera), or in a position sister to Neuropterida. All other supra-ordinal- and ordinal-level relationships recovered with strong nodal support were consistent with most other recent studies. Conclusions/Significance: These results, coupled with the recent proposed placement of Strepsiptera sister to Coleoptera, suggest that while the phylogenetic neighborhood of Strepsiptera has been identified, unequivocal placement to a specific branch within Neuropteroidea will require additional study.